Signal Separation Motivated by Human Auditory Perception: Applications to Automatic Speech Recognition

Author

  • M. Stern
Abstract

The human auditory system uses a number of well-identified cues to segregate and separate individual sound sources in a complex acoustical environment. For example, researchers in auditory scene analysis have long identified cues such as common onset, correlated fluctuations in instantaneous amplitude and frequency, harmonicity, and common interaural time and amplitude differences as ways of identifying which components of a complex signal are derived from a common source. It is widely believed that the use of these cues to achieve such “grouping” and signal separation should be very useful in improving the accuracy of automatic speech recognition in very difficult environments such as competing speech, background music, and transient noise, and this has been a goal of several research groups in computational auditory scene analysis. This talk describes and discusses several ways in which signals can be separated using physiologically-motivated cues, along with the potential benefit to be derived from such separation for automatic speech recognition.


Similar resources

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...


Applying physiologically-motivated models of auditory processing to automatic speech recognition

For many years the human auditory system has been an inspiration for developers of automatic speech recognition systems because of its ability to interpret speech accurately in a wide variety of difficult acoustical environments. This paper discusses the application of physiologically-motivated approaches to signal processing that facilitate robust automatic speech recognition in environments w...


Effect of signal to noise ratio on the speech perception ability of older adults

Background: Speech perception ability depends on auditory and extra-auditory elements. The signal-to-noise ratio (SNR) is an extra-auditory element that has an effect on the ability to normally follow speech and maintain a conversation. Speech in noise perception difficulty is a common complaint of the elderly. In this study, the importance of SNR magnitude as an extra-auditory effect on speech...


Signal Processing for Robust Speech Recognition Motivated by Auditory Processing

Although automatic speech recognition systems have dramatically improved in recent decades, speech recognition accuracy still significantly degrades in noisy environments. While many algorithms have been developed to deal with this problem, they tend to be more effective in stationary noise such as white or pink noise than in the presence of more realistic degradations such as background music,...


An automatic speech recognition system based on the scene analysis account of auditory perception

Despite many years of concentrated research, the performance gap between automatic speech recognition (ASR) and human speech recognition (HSR) remains large. The difference between ASR and HSR is particularly evident when considering the response to additive noise. Whereas human performance is remarkably robust, ASR systems are brittle and only operate well within the narrow range of noise cond...




Publication date: 1993